<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
 "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en-US">
  <head>
    <title>phastregex.  Regular expression genome search with phastCons conservation scores.</title>
    <meta http-equiv="Content-Type" content ="text/html; charset=utf-8">
    <link rel="stylesheet" href="phastregexjw.css" type="text/css">
<!--MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMM Jason L Weirather MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMMMMMMMMMMMMMMMMNMMMMMMMMMho+hNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMMMMMMMMMMMMMMMNNNMMMMMMMNo+-ooo+hdhyyhhhdMMMMMMMMMMMMNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMMMMMMMMNNMMMNNddddddhdNhso+/.`  .:/o+~~-:/oo/oyhNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMMMMMMMNhyyyhso/oo+o/-+o/.``       -`` ` `     ``/yddmmNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMMMMMNdsoooo+/.+oo+-..+:.                        .:.-:odNmNMNNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMMMmysoo+oo/`-:-/o+/.``:/                        `    `+ysyNNMNNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMmyoo++:.++` `  ```     `                              `:+oyMMhmNNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMyooooo//+`                                              `/oNNmNmNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MNMMhsoooooo:`    `                                           /shdmysdNMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MNmdyoooooo+-                                                -+/oosooyNmhmNMNMNNNMMMMMMMMMMMMMMMMMMM-->
<!--Mmdsoooooooo/`                                               ```:/oosyoooshNdmddNNMMMMMMMMMMMMMMMMMM-->
<!--Mmhssooooo+o+`                                                 -.`./oo:/oosdhhshMMMMMMMMMMMMMMMMMMMM-->
<!--MNdyoooooo+oo```                                                   -++:+osysoosmMMMMMMMMMMMMMMMMMMMM-->
<!--mddsoooooooo++-` `                                                  `-+oshmyshmNMMMMMMMMMMMMMMMMMMMM-->
<!--mhhooooooo+/:-``                                                      +ossdmNNNmhhMMMNMMMMMMMMMMMMMM-->
<!--Nhsoooooooooo-                                                        /+/:/::/+osmmhymmMMMMMMMMMMMMM-->
<!--hooooooooo++/.                                                         `      `-+ssohNMMMMMMMMMMMMMM-->
<!--ysoooooooo/+-`                                                                   .:+ohdNMMMMMMMMMMMM-->
<!--soooooooo+.`                                                                       `/osyNMMMMMMMMMMM-->
<!--ooooooo+:~~                                                                        .oooosNMMMMMMMMMM-->
<!--ooooooo+`                                                                         .:+oooodMMMMMMMMMM-->
<!--oooooooo:`.`                                                                       oooooosMMMMMMMMMM-->
<!--oooooooo/-/.                                                                       -ooooooMMMMMMMMMM-->
<!--ooooo+ooo`.`                                                                     ``-oooooyMMMMMMMMMM-->
<!--ooooooooo+/.                                                                    .:++oooooyMMMMMMMMMM-->
<!--oooooooo+-   ````:-/:~~`````                                                     `/oooooodMMMMMMMMMM-->
<!--ooooooo+- `.:/soshhhso+o+//:`.`                                                   :o+ooosMMMMMMMMMMM-->
<!--oooooo+..:syssyosoo//o++++oso+/:~~.``                                             `.:ooohMMMMMMMMMMM-->
<!--ooooo++oyhsoo+/-..    `.-:/o+syhddso+/-`                                            `/oodMMMMMMMMMMM-->
<!--oooooossoo+/:`            ````:osdmmNMNh+-.`                                        `+ooyMMMMMMMMMMM-->
<!--oooooo/.`.`              ````.`.-+osyddmdyo+:                                      `.+oosNMMMMMMMMMM-->
<!--oooooo.   ``  .`..-::/o+++osoooosssoooooso+/-  -                         `..~~`-//++osdmhmMMMMMMMMMM-->
<!--oooooo+:./oo++ooossysdNNhyooyyoooossoooooo::-``.                    `.::/osydmdddddNNMMMMMMMMMMMMMMM-->
<!--ooooooooooooooohmMmo:/mMMMNMMN+yyooooooooooo+/.                 `:+ydNNMMMMNNNddddmNMMMMMMMMMMMMMMMM-->
<!--oooooooooooooymms+`  hMMMNhMMN: `:+oooooooooo+                 -+hNNNNmmdmysooo++/+ssydMMMMMMMMMMMMM-->
<!--::+oooooooooooosso:` :NMMMMMNh`    `:+/:///oo+:              .+oydysssyddNyydmdhssooooohMMMMMMMMMMMM-->
<!--/  -ooooooooooooooo+-`.sddhs:`  ``.-::/    +o:.             -oo+::+mMMmNMMN. .+mMNdysoodMMMMMMMMMMMM-->
<!---  .oooooooooooooooooo+:/-` .`.:+:+/+/-.  `o/               -oo+/`.mmMmNMMM-  `ymMMMhohNMMMMMMMMMMMM-->
<!--  ``/:~~-:/oooooooooooooo///+/:/:/oo+-   ./+-               +oooo+-+dmMMMNo  .s+mMMmyyMMMMMMMMMMMMMM-->
<!--`:`        `-/ooooooooooooo+/o//+/:.` ``~~.                 /oooo+:-://so-.-:+ydMMMMMMMMMMMMMMMMMMMM-->
<!--             `-/.-`-.~~-:...-::.``.  ``-:``   `             -ooooooooooooooodmMMMMMMdymMMMMMMMMMMMMM-->
<!--               ...`       `````          /:                 :ooooooooooooooosymMNmmhooyMMMMMMMMMMMMM-->
<!--                                                            :ooooooooooooooooooysoysoosNMMMMMMMMMMMM-->
<!--                                                            :ooooooooooooooooooooooooosMMMMMMMMMMMMM-->
<!--                                                            -oooo+`:+ooooooooooooooooooNMMMMMMMMMMMM-->
<!--                                                            .ooooo- `:/ooooooooooooooosMMMMMMMMMMMMM-->
<!--                                                            .oooo+`    `-/++oooooooooooNMMMMMMMMMMMM-->
<!---                                                           `ooooo.       ```-/.`:oooosMMMMMMMMMMMMM-->
<!--o-`                                  .``:-/.                 +ooo:`            ..-:ooohMMMMMMMMMMMMM-->
<!--/-`                                `:+:-```                 :oooo.`              :ooooNMMMMMMMMMMMMM-->
<!--/+.`                      ``       /o/                      `-ooo+:              -oooyMMMMMMMMMMMMMM-->
<!--o+/+/-.`                ``.`       :o-                       -oooo/.`            /ooyMMMMMMMMMMMMMMM-->
<!--oo/+/o+` `           `++-          `+/`                      -oooo-+`          `:oosNMMMMMMMMMMMMMMM-->
<!--oo+-.:~~             ```            `/+/.```````             :oooo.::-`      `:/oosNMMMMMMMMMMMMMMMM-->
<!--ooo+~~``                              ..-:///:++::           /oooo/ooo+-.- `.+ooosmMMMMMMMMMMMMMMMMM-->
<!--oooooo..                                  ``` `/so-        `:oyooooooooooo++oooodNMMMMMMMMMMMMMMMMMM-->
<!--oooo/o-``       `           `         `    .    +os+:.```.:+oooooooooooooooooosmMMMMMMMMMMMMMMMMMMMM-->
<!--oooooo+:`      -+`       ` `                   -::~~/ooosoooooooooo++ooooooooymMMMMMMMMMMMMMMMMMMMMM-->
<!--oooooooo+.`  `-+:-````````                     :`     `.+:/+oooooooooooooooodMMMMMMMMMMMMMMMMMMMMMMM-->
<!--oooooooo~~``  .+/++o+:/+/:.``````  `           `        -::/ooooooooooo+ooodMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--oooooooo+:.`   /+oooo++oshhhysooo++++//+//:/:::-.```.~~-++-/ooooooooooo/oymMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--ooooooooo/:-`  :++oo/-`.+osshmmmmh-.~~~~:://++++o+/+oysssssysyoooooo+oooyNMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--soooooooo+:.-``:o+-.    /ooooosyyhho:..      ```.-:/yhsssssysooooooooosmNMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--dysosoooooo/o+:/o/``    /ooooooooosssoo/:...-...-+ooooooooooooooooooohNMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--Nmyososoooooo+ooo+:``   `:ooooooo+/oooooooooooooooooooooooooooooooosdMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMNdyosooooooo+++o+:-     -oooooo/..-/+ooooo+:+ooo+oooooooooooooosyNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMNmhsyssoooooooooo/.   .` .ooooooo/-`.~~/oo:~~/+oooooooooooooossNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMmdddNhhsooooooo+/-`  `  /oooooooo+///+oo+o+oooooo+ooooooossmNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--NMMMMNNNMNhssoooooooo+/::- ``.:+ooooooooooooooooooooooooooooshmMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--sdMMMMMMMMMdhoooooooo+/:/-     `-/oooooooooooooo+/o/o+ooooosmMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--ooymMMMMMMMMmdhosooo+/:`:~~ .    .::/:::::/+++oo///:+ooooosdNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMNNNNNddy-->
<!--oooshNMMMMMMMMmysooo/-/..`` `               ``+oooo+oooosshMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMNNmNNmmmmNN-->
<!--oooooshNMMMMMMMNysoo++/:.`.````              `/+oooooooyssNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--oooooooymMMMMMMMNsso+++/~~:.~~                :/oooooosyyNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--oooooooooyNMMMNMMhy+++/-`.``-`` `             ..oooshhshmMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--oooooooo+osdNMMMMMdsooo/:-`:::-.   `       `-.-+oosmmMddMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--ooooooooooo+sdMMMMNdso++/-://o+/-..  . `:::/+:/oosydNMMNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--ooooooooooooooyNMMMMNysoo/++/+oooo/~~:-+ooooooosshmNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--y-+ooooooooooooohmMMMMmddsso+oo+o+oooooo+osooshhNNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--Mh:+ooooooooooooooydNMMMMNmhddhdhdhhhyshyhNdmNMMNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMs/ooo+/++/ooooooooyhmNNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMdo//` `-:/+ooooooooooosyhhddddmmmmmmmmNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMNd:```~~~~o+/oooooooo+ooooooooo+oooymMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMMMMs.`.-:-:/:/:++:./+o-:/++:://oosmMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMMMMMMd/```../- .`  `-.  ..`    -hMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMMMMMMMMhy-//+-          `     -mMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMMMMMMMMMh:/.`-`              +NMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMMMMMMMMMMMMNs-`             oMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMMMMMMMMMMMMMMMN+           oMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->
<!--MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM-->  
</head>
  <body>
    <div id="main_page_one">
      <div id="top_title">Find a regular expression in a sequence and look for conservation.</div>
      <div id="main_logo_block"></div>
      <div id="input_title">HELP</div>
      <div id="faq_title">FAQ</div>
      <div class="question">What is a regular expression?</div>
      <div class="answer">A regular expression is a flexible and concise means of matching strings of text.<a class="reflink" href="http://en.wikipedia.org/wiki/Regular_expression">wikipedia:Regular_expression</a></div>
      <div class="question">Why use a regular expression on genomic sequence?</div>
      <div class="answer">Regular expressions are useful for searching for a degenerate sequence across a large stretch of genetic sequence, especially when:<br>
                         1. A sequence is too short to search for reliably with more powerful alignment tools like BLAST or BLAT.<br>
                         2. The degeneracy of a sequence results in too many possible motifs to reasonably search for individually.<br>
                         3. A flexible definition of a sequence is needed to allow both nucleotide variability and the optional presence or absence of nucleotides.</div>
      <div class="question">Why are there two inputs for regular expressions?</div>
      <div class="answer">You could definitely define both the forward and reverse directions of a sequence match with one regular expression.  However, I find the syntax a little simpler when they are in two statements.  The page will work either way, but if you want the matches included in the same table rather than in two different tables, you should combine them into one regular expression.</div>
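      <div class="answer">As a rough illustration of combining both directions into one expression, here is a minimal Python sketch.  It is not the code behind this page; the helper name <b>reverse_complement_pattern</b> and the test sequence are made up for the example, and the string-reversal trick only works for simple patterns built from A, C, G, T and bracketed classes (not for quantifiers such as {1,3}).
<pre>
```python
import re

# Complement of each base; brackets are swapped so reversed classes stay well formed.
COMPLEMENT = str.maketrans("ACGT[]", "TGCA][")

def reverse_complement_pattern(pattern):
    """Reverse-complement a pattern made only of A, C, G, T and [..] classes."""
    return pattern[::-1].translate(COMPLEMENT)

forward = "ATGCC[TA]TTA"
reverse = reverse_complement_pattern(forward)

# One expression that matches either direction, joined with alternation.
combined = re.compile(forward + "|" + reverse)

# Hypothetical sequence with one forward and one reverse-strand occurrence.
sequence = "GGATGCCATTACCTAATGGCATGG"
hits = [(m.start(), m.group()) for m in combined.finditer(sequence)]
print(reverse)
print(hits)
```
</pre>
      On this page, the equivalent single expression would be written with delimiters as <b>/ATGCC[TA]TTA|TAA[TA]GGCAT/</b>.</div>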
      <div class="question">How do I write a regular expression?</div>
      <div class="answer">The regular expressions implemented here are in Perl regular expression format.<a class="reflink" href="http://perldoc.perl.org/perlre.html">perldoc:perlre</a>  This format is a personal preference, but the syntax is adequate to accomplish the task.<br>
                          You must include delimiters on either side of your regular expression.  I use the typical forward slash delimiters.<br>
                          As a very basic example, to find all occurrences of the string <b>ATGCCTTTA</b>, I would use the regular expression <b>/ATGCCTTTA/</b><br>
                          <i>Advanced note:  There is no need to include a global tag 'g' after your regular expression; this program will automatically find ALL occurrences of your sequence.</i><br>
                          All characters in the databases should be in upper case; however, if for some reason you need to make a search case insensitive, you could append the letter 'i' as in <b>/ATGCCTTTA/i</b><br>
                          So if you search a genomic location chrZ:11000-11200 containing the sequence GCATCGACATTCAGTCA<b>ATGCCTTTA</b>GTCAGTACATCCT<b>ATGCCTTTA</b>AGCTCGTAACGTACGTC with the regular expression <b>/ATGCCTTTA/</b>, you will get two matches in the output.<br>
                          Now, if you want to recognize either the sequence ATGCC<b>T</b>TTA or ATGCC<b>A</b>TTA, you could define your search with the regular expression <b>/ATGCC[TA]TTA/</b>.  The <b>[TA]</b> tells the regular expression that either the letter T or the letter A would be a match.<br>
                          You can also match a repeated character.  For example, if the sequence AT<b>G</b>CCTTTA could have either a single <b>G</b> or up to 3 <b>G</b>'s, you can use the regular expression <b>/ATG{1,3}CCTTTA/</b>.  The curly brackets tell the regular expression that the previous character (<b>G</b> in this case) may occur anywhere from 1 to 3 times.<br>
      </div>
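      <div class="answer">To try the patterns above outside this page, a small Python sketch like the following may help.  Python's <b>re</b> module treats these particular patterns the same way Perl does, but it takes the pattern without the slash delimiters and uses a flag in place of the trailing 'i'.  The helper name <b>find_all</b> and the short test strings are made up for the example.
<pre>
```python
import re

# The example sequence from the text, split so the two motif copies stand out.
sequence = ("GCATCGACATTCAGTCA" "ATGCCTTTA" "GTCAGTACATCCT"
            "ATGCCTTTA" "AGCTCGTAACGTACGTC")

def find_all(pattern, text, flags=0):
    """Return (start, matched_text) for every non-overlapping match."""
    return [(m.start(), m.group()) for m in re.finditer(pattern, text, flags)]

# /ATGCCTTTA/ : exact string, found twice in the example sequence.
exact = find_all("ATGCCTTTA", sequence)

# /ATGCC[TA]TTA/ : T or A at the sixth position, but not G.
either = find_all("ATGCC[TA]TTA", "ATGCCTTTA ATGCCATTA ATGCCGTTA")

# /ATG{1,3}CCTTTA/ : one to three G's; the copy with four G's does not match.
repeat = find_all("ATG{1,3}CCTTTA", "ATGCCTTTA ATGGGCCTTTA ATGGGGCCTTTA")

# /ATGCCTTTA/i : case-insensitive search.
nocase = find_all("ATGCCTTTA", "atgcctttA", re.IGNORECASE)

print(len(exact), len(either), len(repeat), len(nocase))
```
</pre>
      The start positions returned here are 0-based offsets into the string, not genomic coordinates.</div>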
      <div class="question">What is the conservation being displayed?</div> 
      <div class="answer">This is the score displayed in the UCSC genome browser.<a class="reflink" href="http://genome.ucsc.edu/">http://genome.ucsc.edu/</a>  It is a phastCons score, rounded to the nearest of a few stored levels for ease of storage (either no conservation, 0.1, 0.5, or 0.9).  This score is useful if you want to know whether the short sequence identified could have an important conserved biological function.</div>
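      <div class="answer">One way to picture the rounding is the Python sketch below.  It is only an assumption about how such rounding could work: it snaps a raw score to whichever stored level is closest, with 0.0 standing in for "no conservation".  The exact cutoffs used by this site are not stated here, and the name <b>snap_score</b> is made up for the example.
<pre>
```python
# Stored levels named in the text; 0.0 stands for "no conservation".
LEVELS = (0.0, 0.1, 0.5, 0.9)

def snap_score(score):
    """Return the stored level closest to a raw score between 0 and 1."""
    return min(LEVELS, key=lambda level: abs(level - score))

# Hypothetical raw scores, chosen only to show each level being reached.
for raw in (0.04, 0.2, 0.72, 0.97):
    print(raw, snap_score(raw))
```
</pre>
      </div>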
      <div class="question">What are the gene features displayed?</div>
      <div class="answer">This again is data from UCSC, specifically the refGene tables from their databases.  These were used to infer whether the match is upstream or downstream of a gene or within a gene and, if it is within a gene, whether it is in an exon or an intron.</div>
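      <div class="answer">The kind of inference described above can be sketched in Python as follows.  This is an illustration under stated assumptions, not the code used by this site: the function name <b>classify_position</b>, the example gene, and the choice to flip upstream and downstream on the minus strand are all assumptions.  Coordinates are treated as 0-based, half-open intervals, which is how the refGene start and end columns are usually read.
<pre>
```python
def classify_position(pos, tx_start, tx_end, exons, strand="+"):
    """Label a 0-based position relative to one gene with half-open intervals."""
    if tx_start > pos:
        # Left of the gene: upstream on the + strand, downstream on the - strand.
        return "upstream" if strand == "+" else "downstream"
    if pos >= tx_end:
        # Right of the gene: the labels flip the other way.
        return "downstream" if strand == "+" else "upstream"
    for exon_start, exon_end in exons:
        if pos >= exon_start and exon_end > pos:
            return "exon"
    # Inside the transcribed region but in no exon.
    return "intron"

# Hypothetical gene: transcribed region 100-500 with two exons.
exons = [(100, 200), (300, 500)]
for pos in (50, 150, 250, 500):
    print(pos, classify_position(pos, 100, 500, exons))
```
</pre>
      </div>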
      <div class="question">How do I translate degenerate base pair definitions (nucleotide ambiguity codes) into regular expressions?</div>
      <div class="answer">These are the IUPAC-IUB symbols for nucleotide nomenclature.<a class="reflink" href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC341218/">Cornish-Bowden. 1985.</a><br>
        <div>
        <table class="regex_table">
        <tr class="head_row"><td>abbrev</td><td>base</td><td>regex</td></tr>
        <tr><td>A</td><td>Adenine</td><td>A</td></tr>
        <tr class="odd_row"><td>C</td><td>Cytosine</td><td>C</td></tr>
        <tr><td>G</td><td>Guanine</td><td>G</td></tr>
        <tr class="odd_row"><td>T</td><td>Thymine</td><td>T</td></tr>
        <tr><td>R</td><td>Purine (A or G)</td><td>[AG]</td></tr>
        <tr class="odd_row"><td>Y</td><td>Pyrimidine (C or T)</td><td>[CT]</td></tr>
        <tr><td>S</td><td>Strong (G or C)</td><td>[GC]</td></tr>
        <tr class="odd_row"><td>W</td><td>Weak (A or T)</td><td>[AT]</td></tr>
        <tr><td>M</td><td>Amino (A or C)</td><td>[AC]</td></tr>
        <tr class="odd_row"><td>K</td><td>Keto (T or G)</td><td>[TG]</td></tr>
        <tr><td>B</td><td>C,G,T (not A)</td><td>[CGT]</td></tr>
        <tr class="odd_row"><td>D</td><td>A,G,T (not C)</td><td>[AGT]</td></tr>
        <tr><td>H</td><td>A,C,T (not G)</td><td>[ACT]</td></tr>
        <tr class="odd_row"><td>V</td><td>A,C,G (not T)</td><td>[ACG]</td></tr>
        <tr><td>X/N</td><td>any base</td><td>.</td></tr>
        </table>
        </div>
      </div>
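      <div class="answer">If you have many motifs to convert, the table above can be applied mechanically.  The Python sketch below is one way to do it; the names <b>IUPAC_TO_REGEX</b> and <b>iupac_to_regex</b> are made up for the example, and the slashes are added only when printing so the result can be pasted into this page.
<pre>
```python
import re

# Regex fragment for each IUPAC nucleotide code, as listed in the table above.
IUPAC_TO_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[GC]", "W": "[AT]",
    "M": "[AC]", "K": "[TG]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "X": ".", "N": ".",
}

def iupac_to_regex(motif):
    """Translate a motif written with IUPAC codes into a regex string."""
    # Unknown letters raise KeyError rather than being passed through silently.
    return "".join(IUPAC_TO_REGEX[base] for base in motif.upper())

pattern = iupac_to_regex("ATGCCWTTA")
print("/" + pattern + "/")

# The translated pattern matches both expansions of W.
print(bool(re.fullmatch(pattern, "ATGCCATTA")), bool(re.fullmatch(pattern, "ATGCCTTTA")))
```
</pre>
      For the motif ATGCCWTTA this gives <b>/ATGCC[AT]TTA/</b>, which matches the same sequences as the <b>/ATGCC[TA]TTA/</b> example above.</div>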
      <div id="reference">Jason Weirather 2010.</div>
    </div>
  </body>
</html> 

